Abstract. We apply statistical mechanics to an inverse problem of linear
mapping to investigate the physics of irreversible compression. We use the
replica symmetry breaking (RSB) technique with a toy model to demonstrate
Shannon's result. The rate distortion function, which is widely known as the
theoretical limit of compression with a fidelity criterion, is derived using the
Parisi one step RSB scheme. The bound cannot be achieved in sparsely-connected
systems, where suboptimal solutions dominate the capacity.

Statistical physics and information science may have been expected to be directed
towards common objectives ever since Shannon formulated an information theory based on
the concept of entropy. However, envisaging how this would actually happen would have
been difficult. The situation has changed greatly since the field of disordered
statistical systems became well established [1]. The areas where these relations are
particularly strong are Shannon's theory [2] and the replica theory of classical spin
systems with quenched disorder [3]. Triggered by the work of Sourlas [4], these links
have recently been examined in the areas of error correction [5, 6], network information
theory [7], and turbo decoding [8]. Recent results on these topics are mostly derived
using the replica trick.

However, the research in this cross-disciplinary field so far can be categorized
as a so-called 'zero distortion' decoding scheme in terms of information theory: the
system requires perfect reproduction of the input alphabets. The same spin
glass techniques should also be useful for describing the physics of systems with a fidelity
criterion, i.e., systems in which a certain degree of information distortion is accepted
when reproducing the alphabets. This framework is called rate distortion theory. Although
practical information processing must take distortion into account, since
input alphabets are mostly represented by continuous variables, statistical physics
has so far offered only a few approaches, based on highly modified perceptrons.

In this paper, we introduce a simplified model that achieves optimality using
only parity checks, like Gallager's code. We can then easily see how
information distortion can be handled by the concepts of statistical physics. More
specifically, we study the inverse problem of a Sourlas-type decoding problem by using
the framework of replica symmetry breaking (RSB) of diluted disordered systems.
According to our analysis, this toy model provides an optimal compression scheme for
an arbitrary distortion level, though the encoding procedure remains an NP-complete
problem without any practical encoders at the moment.

The paper is organized as follows. We first review the concept of rate
distortion theory as well as the main results related to our purpose. We then introduce
a toy model. Finally we obtain results consistent with information theory. Detailed
derivations will be reported elsewhere.

We start by defining the concepts of rate distortion theory and stating the
simplest version of the main result. Let $J$ be a discrete random variable with alphabet
$\mathcal{J}$. Assume that we have a source that produces a sequence $J_1, J_2, \ldots, J_M$,
where each symbol is randomly drawn from a distribution. We will assume that the alphabet
is finite. Throughout this paper we use vector notation to represent sequences for
convenience of explanation: $\boldsymbol{J} = (J_1, J_2, \ldots, J_M)^T \in \mathcal{J}^M$.
Here, the encoder describes the source sequence $\boldsymbol{J} \in \mathcal{J}^M$ by a
codeword $\boldsymbol{\xi} = f(\boldsymbol{J}) \in \mathcal{X}^N$. The decoder represents
$\boldsymbol{J}$ by an estimate $\hat{\boldsymbol{J}} = g(\boldsymbol{\xi}) \in \hat{\mathcal{J}}^M$,
as illustrated in Figure 1. Note that $M$ represents the length of a source sequence,
while $N$ represents the length of a codeword. In this case, the rate is defined by
$R = N/M$. Note that the relation $N < M$ always holds when compression is considered;
therefore, $R < 1$ also holds.

Figure 1. Rate distortion encoder and decoder.

A distortion function is a mapping $d : \mathcal{J} \times \hat{\mathcal{J}} \to \mathbb{R}^{+}$
from the set of source alphabet-reproduction alphabet pairs into the set of non-negative
real numbers. Intuitively, the distortion $d(J, \hat{J})$ is a measure of the cost of
representing the symbol $J$ by the symbol $\hat{J}$. This definition is quite general.
In most cases, however, the reproduction alphabet $\hat{\mathcal{J}}$ is the same as the
source alphabet $\mathcal{J}$. Hereafter, we set $\hat{\mathcal{J}} = \mathcal{J}$ and
adopt the following distortion measure as the fidelity criterion; the Hamming distortion
is given by

$$ d(J, \hat{J}) = \begin{cases} 0 & \text{for } J = \hat{J} \\ 1 & \text{for } J \neq \hat{J} \end{cases} \tag{1} $$

which results in a probability of error distortion, since the relation
$E[d(J, \hat{J})] = P[J \neq \hat{J}]$ holds, where $E[\cdot]$ represents the expectation
and $P[\cdot]$ the probability of its argument. The distortion measure is so far defined
on a symbol-by-symbol basis. We extend the definition to sequences. The distortion
between sequences $\boldsymbol{J}, \hat{\boldsymbol{J}} \in \mathcal{J}^M$ is defined
by $d(\boldsymbol{J}, \hat{\boldsymbol{J}}) = (1/M) \sum_{j=1}^{M} d(J_j, \hat{J}_j)$.
Therefore, the distortion for a sequence is the average distortion per symbol of the
elements of the sequence. The distortion associated with the code is defined as
$D = E[d(\boldsymbol{J}, \hat{\boldsymbol{J}})]$, where the expectation is with respect
to the probability distribution on $\boldsymbol{J}$. A rate distortion pair $(R, D)$ is
said to be achievable if a sequence of rate distortion codes $(f, g)$ exists with
$E[d(\boldsymbol{J}, \hat{\boldsymbol{J}})] \le D$ in the limit $M \to \infty$. Moreover,
the closure of the set of achievable rate distortion pairs is called the rate distortion
region for a source. Finally, we can define a function to describe the boundary; the
rate distortion function $R(D)$ is the infimum of rates $R$ such that $(R, D)$ is in the
rate distortion region of the source for a given distortion $D$.

We restrict ourselves to a binary source $J$ with a Hamming distortion
measure for simplicity. We assume that binary alphabets are drawn randomly, i.e., the
source is unbiased, which rules out the possibility of compression due to redundancy. We
now find the description rate $R(D)$ required to describe the source with an expected
proportion of errors less than or equal to $D$. In this simplified case, according to
Shannon, the boundary can be written as follows; the rate distortion function for a
binary source with Hamming distortion is given by

$$ R(D) = \begin{cases} 1 - h_2(D) & \text{for } 0 \le D \le 1/2 \\ 0 & \text{for } 1/2 < D \end{cases} \tag{2} $$

where $h_2(\cdot)$ represents the binary entropy function.
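As a small illustration of equation (2), the following sketch evaluates the binary entropy function and the resulting rate distortion function. The function names are chosen here for illustration only, and the entropy is taken in bits.

```python
import math


def binary_entropy(p: float) -> float:
    """Binary entropy h2(p) in bits, with the convention h2(0) = h2(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def rate_distortion_binary(D: float) -> float:
    """R(D) of equation (2): unbiased binary source, Hamming distortion."""
    if D < 0.0:
        raise ValueError("distortion must be non-negative")
    return 1.0 - binary_entropy(D) if D <= 0.5 else 0.0


# Tabulate a few points on the boundary of the rate distortion region.
for D in (0.0, 0.05, 0.11, 0.25, 0.5):
    print(f"D = {D:.2f}  R(D) = {rate_distortion_binary(D):.4f}")
```

The curve falls from $R = 1$ at $D = 0$ (lossless description of an unbiased bit) to $R = 0$ at $D = 1/2$, where a random guess already meets the fidelity criterion.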

Next we introduce a simplified model for lossy compression. We use the
inverse problem of Sourlas-type decoding to realize the optimal encoding scheme, a
variation of which has recently been investigated by information theorists [14]. As in
the previous paragraphs, we assume that binary alphabets are drawn randomly from an
unbiased source and that the Hamming distortion measure is selected as the fidelity
criterion.

We take the Boolean representation of the binary alphabet, i.e., we set
$\mathcal{J} = \{0, 1\}$. We also set $\mathcal{X} = \{0, 1\}$ to represent the codewords
throughout the rest of this paper. Let $\boldsymbol{J}$ be an $M$-bit source sequence,
$\boldsymbol{\xi}$ an $N$-bit codeword, and $\hat{\boldsymbol{J}}$ an $M$-bit reproduction
sequence. Here, the encoding problem can be written as follows. Given a distortion $D$
and a randomly-constructed Boolean matrix $A$ of dimensionality $M \times N$, we find the
$N$-bit codeword sequence $\boldsymbol{\xi}$ which satisfies

$$ \hat{\boldsymbol{J}} = A \boldsymbol{\xi} \pmod 2, \tag{3} $$

where the fidelity criterion $D = E[d(\boldsymbol{J}, \hat{\boldsymbol{J}})]$ holds for
every $M$-bit source sequence $\boldsymbol{J}$. Note that we applied modulo 2 arithmetic
for the additive operations in (3).

In our framework, decoding is just a linear mapping $\hat{\boldsymbol{J}} = A\boldsymbol{\xi}$,
while encoding remains an NP-complete problem.
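The asymmetry between decoding and encoding can be made concrete on a tiny instance. The sketch below is only an illustration under assumed sizes ($M = 12$, $N = 6$) and an unstructured random matrix; the exhaustive search stands in for the (impractical) optimal encoder and is not a method proposed in this paper.

```python
import itertools

import numpy as np


def decode(A, xi):
    """Linear decoder of equation (3): reproduction = A xi (mod 2)."""
    return (A @ xi) % 2


def brute_force_encode(A, J):
    """Exhaustive search over all 2^N codewords; feasible only for tiny N."""
    _, N = A.shape
    best_xi, best_d = None, float("inf")
    for bits in itertools.product((0, 1), repeat=N):
        xi = np.array(bits, dtype=np.int64)
        d = float(np.mean(decode(A, xi) != J))  # Hamming distortion per bit
        if d < best_d:
            best_xi, best_d = xi, d
    return best_xi, best_d


rng = np.random.default_rng(0)
M, N = 12, 6                           # assumed toy sizes, rate R = N/M = 1/2
A = rng.integers(0, 2, size=(M, N))    # unstructured random Boolean matrix
J = rng.integers(0, 2, size=M)         # unbiased source sequence
xi, d = brute_force_encode(A, J)
print("codeword:", xi, "distortion:", d)
```

Decoding costs a single matrix-vector product, whereas this encoder visits $2^N$ candidates, which is why the search is restricted to very small $N$ here.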

Kabashima and Saad recently expanded on the work of Sourlas, which focused
on the zero-rate limit, to an arbitrary-rate case [5]. We follow their construction of
the matrix $A$, so that we can treat non-trivial situations. Let the Boolean matrix $A$ be
characterized by $K$ ones per row and $C$ per column. The finite, and usually small,
numbers $K$ and $C$ define a particular code. The rate of our codes can be set to an
arbitrary value by selecting the combination of $K$ and $C$: counting the ones of $A$ by
rows and by columns gives $MK = NC$, so the rate is $R = N/M = K/C$. If the value of $K$
is small, i.e., the relation $K \ll N$ holds, the Boolean matrix $A$ is very sparse. By
contrast, when we consider densely constructed cases, $K$ must be extensive and
have a value of $O(N)$. We can also consider the intermediate case in which $K$ is not
$O(1)$ but $K \ll N$ still holds.
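One simple way to draw a matrix with exactly $K$ ones per row and $C$ ones per column is to shuffle column "stubs" and reject draws in which a row repeats a column. This is a sketch of one possible sampler, assumed here for illustration; it is not necessarily the ensemble construction used by Kabashima and Saad, and the function name and retry limit are hypothetical.

```python
import numpy as np


def regular_sparse_matrix(M, N, K, C, rng, max_tries=10_000):
    """Random M x N Boolean matrix with K ones per row and C ones per column."""
    if M * K != N * C:
        raise ValueError("need M*K == N*C so that row and column counts agree")
    stubs = np.repeat(np.arange(N), C)      # each column index appears C times
    for _ in range(max_tries):
        rows = rng.permutation(stubs).reshape(M, K)
        # Reject the draw if some row picked the same column twice.
        if all(len(set(r)) == K for r in rows):
            A = np.zeros((M, N), dtype=np.int64)
            A[np.arange(M)[:, None], rows] = 1
            return A
    raise RuntimeError("no valid matrix found; increase max_tries")


rng = np.random.default_rng(1)
A = regular_sparse_matrix(M=12, N=8, K=2, C=3, rng=rng)
print("rate R = K/C =", 2 / 3, " shape:", A.shape)
```

Rejection becomes slow when $C$ is large at fixed small $N$, so this sampler is only meant for small demonstrations.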

The similarity between codes of this type and Ising spin systems was first
pointed out by Sourlas, who formulated the mapping of a code onto an Ising
spin system Hamiltonian in the context of error correction. To facilitate the
current investigation, we first map the problem to that of an Ising model with finite
connectivity following Sourlas' method. We use the Ising representation $\{1, -1\}$ of the
alphabets $\mathcal{J}$ and $\mathcal{X}$ rather than the Boolean one $\{0, 1\}$; the
elements of the source $\boldsymbol{J}$ and the codeword sequences $\boldsymbol{\xi}$ are
rewritten in Ising values, and the reproduction sequence $\hat{\boldsymbol{J}}$ is generated
by taking products of the relevant binary codeword sequence elements in the Ising
representation, $\hat{J}_\mu = \prod_{i \in \mathcal{S}(\mu)} \xi_i$. Here, we denote the set
of codeword indices $i$ that participate in the message index $\mu$ by
$\mathcal{S}(\mu) = \{ i \mid a_{\mu i} = 1 \}$ with $A = (a_{\mu i})$. Therefore, the chosen
$i$'s correspond to the ones per row, producing an Ising version of $\hat{\boldsymbol{J}}$.
Note that the additive operation in the Boolean representation is translated into
multiplication in the Ising one. Hereafter, we set $J_\mu, \hat{J}_\mu, \xi_i = \pm 1$
while we do not change the notation for simplicity. As we use statistical-mechanics
techniques, we consider the source and codeword-sequence dimensionality ($M$ and $N$,
respectively) to be infinite, keeping the rate $R = N/M$ finite. To explore the system's
capabilities, we examine the Hamiltonian:

$$ \mathcal{H}(\boldsymbol{S} | \boldsymbol{J}) = \sum_{\mu=1}^{M} G[\boldsymbol{S} | J_\mu], \tag{4} $$

with

$$ G[\boldsymbol{S} | J_\mu] = -J_\mu \prod_{i \in \mathcal{S}(\mu)} S_i, \tag{5} $$

where we have introduced the dynamical variable $S_i$ to find the optimal value of $\xi_i$,
and $G[\boldsymbol{S} | J_\mu]$ denotes the local connectivity of a random hypergraph
neighboring the message bit $J_\mu$ [15]. In addition, we now introduce the connectivity
tensor $Q$ satisfying the relation:

$$ \sum_{\mu=1}^{M} \prod_{i \in \mathcal{S}(\mu)} S_i = \sum_{i_1 < \cdots < i_K} Q(i_1, \ldots, i_K)\, S_{i_1} \cdots S_{i_K}, \tag{6} $$

for any configuration of $\boldsymbol{S}$. Elements of the connectivity tensor
$Q(i_1, \ldots, i_K)$ take the value one if the corresponding indices of codeword bits are
chosen (i.e., if all corresponding elements of a row of the matrix $A$ are one) and zero
otherwise; $C$ ones per index $i$ represent the system's degree of connectivity.
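The link between the Hamiltonian and the Hamming distortion can be checked numerically. With the mapping Boolean $0 \to +1$, $1 \to -1$, each term $-J_\mu \prod_i S_i$ equals $-1$ for a correctly reproduced bit and $+1$ otherwise, so $\mathcal{H} = -M(1 - 2d)$ where $d$ is the per-bit distortion. The sketch below verifies this on a random instance; sizes and helper names are assumptions made for the illustration.

```python
import numpy as np


def to_ising(bits):
    """Map Boolean {0, 1} to Ising {+1, -1}."""
    return 1 - 2 * np.asarray(bits)


def hamiltonian(A, S, J_ising):
    """H(S|J) = -sum_mu J_mu * prod_{i in S(mu)} S_i, as in (4) and (5)."""
    # For each row of A, multiply the spins at the positions holding a one.
    prods = np.array([np.prod(S[row == 1]) for row in A])
    return -float(np.sum(J_ising * prods))


rng = np.random.default_rng(2)
M, N = 10, 5                              # assumed toy sizes
A = rng.integers(0, 2, size=(M, N))
J = rng.integers(0, 2, size=M)
xi = rng.integers(0, 2, size=N)

J_hat = (A @ xi) % 2                      # Boolean decoder, equation (3)
d = float(np.mean(J_hat != J))            # Hamming distortion per bit
H = hamiltonian(A, to_ising(xi), to_ising(J))
print("H =", H, " -M(1 - 2d) =", -M * (1 - 2 * d))
```

Minimizing the energy over $\boldsymbol{S}$ is therefore the same task as minimizing the distortion over codewords, which is what allows the encoding problem to be treated as a ground-state search.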

For calculating the partition function
$Z(\boldsymbol{J}) = \mathrm{Tr}_{\{\boldsymbol{S}\}} \exp[-\beta \mathcal{H}(\boldsymbol{S} | \boldsymbol{J})]$,
we apply the replica method following the calculation of Kabashima and Saad. To calculate
the replica free energy, we have to calculate the annealed average of the $n$-th power of the
partition function by preparing $n$ replicas. Here we introduce the inverse temperature
$\beta$, which can be interpreted as a measure of the system's sensitivity to distortions.
Although larger values of $\beta$ seem preferable for realizing smaller reproduction
errors, taking the limit $\beta \to \infty$ fails to provide the optimal solution. This is
a direct consequence of the system's irreversibility. As we see in the following calculation,
the optimal value of $\beta$ is naturally determined when the consistency of the replica
symmetry breaking scheme is considered. We use integral representations of
the Dirac $\delta$ function to enforce the restriction, $C$ bonds per index, on $Q$ [16]:

$$ \delta\Big( \sum_{\langle i_2, \ldots, i_K \rangle} Q(i, i_2, \ldots, i_K) - C \Big) = \oint_{|Z|=1} \frac{dZ}{2\pi \mathrm{i}}\, Z^{\sum_{\langle i_2, \ldots, i_K \rangle} Q(i, i_2, \ldots, i_K) - (C+1)}, \tag{7} $$

giving rise to a set of order parameters

$$ q_{\alpha\beta\cdots\gamma} = \frac{1}{N} \sum_{i=1}^{N} Z_i S_i^{\alpha} S_i^{\beta} \cdots S_i^{\gamma}, \tag{8} $$

where $Z_i$ is the integration variable of the constraint at index $i$,
$\alpha, \beta, \ldots, \gamma$ represent replica indices, and the average over
$\boldsymbol{J}$ is taken with respect to the probability distribution:

$$ P[J_\mu] = \frac{1}{2} \delta(J_\mu - 1) + \frac{1}{2} \delta(J_\mu + 1) \tag{9} $$

as we consider unbiased source sequences for simplicity. Assuming replica
symmetry, we use a different representation for the order parameters and the related
conjugate variables:



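$$ q_{\alpha\beta\cdots\gamma} = q \int \pi(x) \tanh^{l}(\beta x)\, dx, \tag{10} $$

$$ \hat{q}_{\alpha\beta\cdots\gamma} = \hat{q} \int \hat{\pi}(\hat{x}) \tanh^{l}(\beta \hat{x})\, d\hat{x}, \tag{11} $$

where $q = [(K-1)!\, NC]^{1/K}$ and $\hat{q} = [(K-1)!]^{-1/K} [NC]^{(K-1)/K}$ are normalization
constants, and $\pi(x)$ and $\hat{\pi}(\hat{x})$ represent probability distributions related
to the integration variables. Here $l$ denotes the number of related replica indices.
Throughout this paper, integrals with unspecified limits denote integrals over the range
$(-\infty, +\infty)$. We then obtain an expression for the free energy per source bit expressed
in terms of the probability distributions $\pi(x)$ and $\hat{\pi}(\hat{x})$: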



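$$
\begin{aligned}
-\beta f = \frac{1}{M} \langle\langle \ln Z(\boldsymbol{J}) \rangle\rangle
 ={}& \ln\cosh\beta
 + \int \Big[ \prod_{l=1}^{K} \pi(x_l)\, dx_l \Big]
   \Big\langle \ln\Big( 1 + \tanh\beta J \prod_{l=1}^{K} \tanh\beta x_l \Big) \Big\rangle_{J} \\
 & - K \int \pi(x)\, dx \int \hat{\pi}(\hat{x})\, d\hat{x}\,
   \ln( 1 + \tanh\beta x \tanh\beta\hat{x} ) \\
 & + \frac{K}{C} \int \Big[ \prod_{l=1}^{C} \hat{\pi}(\hat{x}_l)\, d\hat{x}_l \Big]
   \ln\Big[ \mathrm{Tr}_{S} \prod_{l=1}^{C} ( 1 + S \tanh\beta\hat{x}_l ) \Big],
\end{aligned}
\tag{12}
$$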



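where $\langle\langle \cdots \rangle\rangle$ denotes the average over the quenched randomness
of $\boldsymbol{J}$, and also $Q$, and $\langle \cdot \rangle_{J}$ denotes the average over a
single source bit with respect to (9). The saddle point equations with respect to the
probability distributions provide a set of relations between $\pi(x)$ and $\hat{\pi}(\hat{x})$:

$$ \pi(x) = \int \Big[ \prod_{l=1}^{C-1} \hat{\pi}(\hat{x}_l)\, d\hat{x}_l \Big]\, \delta\Big( x - \sum_{l=1}^{C-1} \hat{x}_l \Big), \tag{13} $$

$$ \hat{\pi}(\hat{x}) = \int \Big[ \prod_{l=1}^{K-1} \pi(x_l)\, dx_l \Big] \Big\langle \delta\Big( \hat{x} - \frac{1}{\beta} \tanh^{-1}\Big[ \tanh\beta J \prod_{l=1}^{K-1} \tanh\beta x_l \Big] \Big) \Big\rangle_{J}. \tag{14} $$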



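By using the result obtained for the free energy, we can easily perform further
straightforward calculations to find all the other observable thermodynamic
quantities, including the internal energy:

$$ e = \frac{1}{M} \Big\langle\Big\langle \frac{\mathrm{Tr}_{\{\boldsymbol{S}\}}\, \mathcal{H}\, e^{-\beta \mathcal{H}}}{Z(\boldsymbol{J})} \Big\rangle\Big\rangle = -\frac{\partial}{\partial\beta} \frac{1}{M} \langle\langle \ln Z(\boldsymbol{J}) \rangle\rangle, \tag{15} $$

which records reproduction errors. Therefore, in terms of the considered replica
symmetric ansatz, a complete solution of the problem seems to be easily obtainable;
unfortunately, it is not.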

The set of equations (13) and (14) may be solved numerically for general $\beta$, $K$,
and $C$. However, there exists an analytical solution of these equations. We first consider
this case. Two dominant solutions emerge that correspond to the paramagnetic and
the spin glass phases. The paramagnetic solution, which is also valid for general $\beta$,
$K$, and $C$, is of the form $\pi(x) = \delta(x)$ and $\hat{\pi}(\hat{x}) = \delta(\hat{x})$;
in the limit $\beta \to \infty$ it has the lowest possible free energy per bit
$f_{\mathrm{para}} = -1$, although its entropy $s_{\mathrm{para}} = (R - 1) \ln 2$ is positive
only for $R > 1$. This means that the true solution must be somewhere beyond the replica
symmetric ansatz. As a first step, which is called one step replica symmetry
breaking (RSB), the $n$ replicas are usually divided into $n/m$ groups, each containing $m$
replicas. Pathological aspects due to replica symmetry may be avoided by making
use of the newly-defined degree of freedom $m$. Indeed, this one step RSB scheme is considered
to provide the exact solutions when the random energy model limit is considered [17],
while our analysis is not restricted to this case so far.

The spin glass solution can be calculated for both the replica symmetric and the
one step RSB ansatz. The former reduces to the paramagnetic solution
($f_{\mathrm{rs}} = f_{\mathrm{para}}$), which is unphysical for $R < 1$, while the latter
yields $\pi_{\mathrm{1RSB}}(x) = \delta(x)$, $\hat{\pi}_{\mathrm{1RSB}}(\hat{x}) = \delta(\hat{x})$
with $m = \beta_g(R)/\beta$ and $\beta_g$ obtained from the root of the equation enforcing the
non-negative replica symmetric entropy

$$ s_{\mathrm{rs}} = \ln\cosh\beta_g - \beta_g \tanh\beta_g + R \ln 2 = 0, \tag{16} $$

with a free energy

$$ f_{\mathrm{1RSB}} = -\frac{1}{\beta_g} \ln\cosh\beta_g - \frac{R}{\beta_g} \ln 2. \tag{17} $$

The simple expression (17) is derived analytically without using any approximations.
However, the stability of the solution must be taken into account when considering
its validity.
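Equation (16) has no closed-form root in $\beta_g$ for a given rate, but it is easy to solve numerically. The sketch below uses plain bisection, which is justified because the left-hand side of (16) decreases monotonically in $\beta_g$ (its derivative is $-\beta_g / \cosh^2\beta_g$). The function names and the bracketing interval are assumptions made for this illustration.

```python
import math


def rs_entropy(beta: float, R: float) -> float:
    """Left-hand side of (16): ln cosh(beta) - beta tanh(beta) + R ln 2."""
    return math.log(math.cosh(beta)) - beta * math.tanh(beta) + R * math.log(2.0)


def solve_beta_g(R: float, lo: float = 1e-9, hi: float = 50.0, iters: int = 200) -> float:
    """Root of (16) by bisection, assuming 0 < R < 1 and a root below `hi`."""
    if not 0.0 < R < 1.0:
        raise ValueError("the sketch assumes 0 < R < 1")
    if rs_entropy(hi, R) > 0.0:
        raise ValueError("root not bracketed; increase hi")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if rs_entropy(mid, R) > 0.0:
            lo = mid      # entropy still positive: the root lies at larger beta
        else:
            hi = mid
    return 0.5 * (lo + hi)


def free_energy_1rsb(R: float) -> float:
    """Free energy (17) evaluated at the root beta_g of (16)."""
    bg = solve_beta_g(R)
    return -(math.log(math.cosh(bg)) + R * math.log(2.0)) / bg


for R in (0.25, 0.5, 0.75):
    print(f"R = {R:.2f}  beta_g = {solve_beta_g(R):.4f}  f_1RSB = {free_energy_1rsb(R):.4f}")
```

Inserting (16) into (17) shows that the free energy at the root equals $-\tanh\beta_g$, which gives a quick consistency check on the numerical values.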

Since the target bit of the estimation in this model is $J_{\langle i_1, \ldots, i_K \rangle}$
and its estimator the product $S_{i_1} \cdots S_{i_K}$, a performance measure for the
information corruption could be the per-bond energy $e$. According to the one step RSB
framework, the lowest free energy can be calculated from the probability distributions
$\pi_{\mathrm{1RSB}}(x)$ and $\hat{\pi}_{\mathrm{1RSB}}(\hat{x})$ satisfying the saddle point
equations (13) and (14) at the characteristic inverse temperature $\beta_g$, where the
replica symmetric entropy $s_{\mathrm{rs}}$ vanishes. Therefore, $f_{\mathrm{1RSB}}$
equals $e_{\mathrm{1RSB}}$. Let the Hamming distortion be our fidelity criterion. The
distortion $D$ associated with this code is given by the fraction of the free energies that
arise in the spin glass phase:

$$ D = \frac{f_{\mathrm{1RSB}} - f_{\mathrm{rs}}}{2 |f_{\mathrm{rs}}|} = \frac{1 - \tanh\beta_g}{2}. \tag{18} $$

Here, we substitute the spin glass solutions into the expression, making use of the
fact that the replica symmetric entropy $s_{\mathrm{rs}}$ vanishes at a consistent $\beta_g$,
which is determined by (16). Using (16) and (18), simple algebra gives the relation between
the rate $R = N/M$ and the distortion $D$ in the form

$$ R = 1 - h_2(D), \tag{19} $$

which coincides with the rate distortion function in Shannon's theorem. We do
not define any non-linear mappings in the decoding stage but we implicitly do in
the encoding stage. This situation is due to the duality between channel coding
and lossy compression; the channel capacity can be achieved using linear encoders.
Furthermore, we do not observe any first-order jumps between analytical solutions.
Recently, we have seen that many approaches to the family of codes characterized by
linear encoding operations result in a quite different picture; the optimal boundary
is constructed in the random energy model limit and is well captured by the concept
of a first-order jump. Our analysis of this model, viewed as a kind of inverse problem,
provides an exception.
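The algebra leading from (16) and (18) to (19) can be spelled out as follows. This is a sketch for $0 < D < 1/2$ that uses only the definitions above, with $h_2$ measured in bits.

```latex
\begin{align*}
\tanh\beta_g &= 1 - 2D, \qquad
\beta_g = \tfrac{1}{2} \ln\frac{1-D}{D}, \qquad
\cosh\beta_g = \frac{1}{2\sqrt{D(1-D)}}, \\
\ln\cosh\beta_g - \beta_g \tanh\beta_g
 &= -\ln 2 - \tfrac{1}{2}\ln D - \tfrac{1}{2}\ln(1-D)
    - \tfrac{1}{2}(1-2D)\,\bigl[\ln(1-D) - \ln D\bigr] \\
 &= -\ln 2 - D \ln D - (1-D)\ln(1-D) \\
 &= -\ln 2 + h_2(D) \ln 2, \\
0 = s_{\mathrm{rs}} &= \bigl[\, h_2(D) - 1 + R \,\bigr] \ln 2
 \quad\Longrightarrow\quad R = 1 - h_2(D).
\end{align*}
```

The first line inverts (18), the middle lines evaluate the first two terms of (16) at that $\beta_g$, and the last line is (16) itself.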

We will now investigate the possibility of other solutions satisfying (13) and
(14) in the case of finite $K$ and $C$. Since the saddle point equations appear difficult to
treat analytically, we resort to numerical evaluations, representing the probability
distributions $\pi_{\mathrm{1RSB}}(x)$ and $\hat{\pi}_{\mathrm{1RSB}}(\hat{x})$ by up to
10^[...] bin models and carrying out the integrations by using Monte Carlo methods. Note
that the characteristic inverse temperature $\beta_g$ is also evaluated numerically by using
(12) and (15). We first calculate the entropy numerically, following the basic relation
$f = e - \beta^{-1} s$. Then we choose the value of $\beta$ that provides $s = 0$. We set
$K = 2$ and selected various values of $C$ to demonstrate the performance of stable
solutions. The numerical results obtained by the one step RSB scenario show suboptimal
properties [Figure 2]. This strongly implies that the analytical solution is not the only
stable solution. Furthermore, there have been recent works on the one-step RSB solution of
the model considered in this paper. The stability of the solution is well examined for
some values of $K$ and $C$.
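To give a feel for how (13) and (14) can be iterated numerically, the sketch below represents each distribution by a population of samples rather than by histogram bins. This is a swapped-in sampling variant chosen for brevity, not the bin-model code used for the results in this paper; the function name, population size, sweep count, and initial condition are all assumptions.

```python
import numpy as np


def iterate_saddle_point(beta, K, C, pop_size=2000, sweeps=50, init_scale=1.0, seed=0):
    """Alternate the updates (14) and (13) on sample populations at fixed beta."""
    rng = np.random.default_rng(seed)
    x = init_scale * rng.standard_normal(pop_size)   # samples representing pi(x)
    x_hat = np.zeros(pop_size)                       # samples representing pi_hat(x_hat)
    for _ in range(sweeps):
        # Update (14): x_hat = atanh( tanh(beta J) * prod_{l<K} tanh(beta x_l) ) / beta
        J = rng.choice([-1.0, 1.0], size=pop_size)   # unbiased source bit, eq. (9)
        picks = x[rng.integers(0, pop_size, size=(pop_size, K - 1))]
        arg = np.tanh(beta * J) * np.prod(np.tanh(beta * picks), axis=1)
        arg = np.clip(arg, -1.0 + 1e-12, 1.0 - 1e-12)  # keep arctanh finite
        x_hat = np.arctanh(arg) / beta
        # Update (13): x = sum of C - 1 independent draws from pi_hat
        x = x_hat[rng.integers(0, pop_size, size=(pop_size, C - 1))].sum(axis=1)
    return x, x_hat


x, x_hat = iterate_saddle_point(beta=2.35, K=2, C=3)
print("mean |x| =", float(np.mean(np.abs(x))), " mean |x_hat| =", float(np.mean(np.abs(x_hat))))
```

Starting from `init_scale=0.0` reproduces the paramagnetic fixed point $\pi(x) = \delta(x)$, while other initial conditions probe whether broader distributions are sustained at the chosen $\beta$, $K$, and $C$.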




Figure 2. Numerically-constructed stable solutions: Stable solutions of (13) and
(14) for finite values of $K$ and $C$ are calculated by using Monte Carlo methods.
We use 10^[...] bin models to approximate the probability distributions
$\pi_{\mathrm{1RSB}}(x)$ and $\hat{\pi}_{\mathrm{1RSB}}(\hat{x})$, starting from various
initial conditions. The distributions converge to continuous ones, giving suboptimal
performance. (○) $K = 2$ and $C = 3, 4, \ldots, 12$; the solid line indicates the rate
distortion function $R(D)$. Inset: Snapshots of the distributions, where $C = 3$ and
$\beta_g = 2.35$.

In this paper two points should be noted. First, we find consistency between
Shannon's rate distortion theory and Parisi's one step RSB scheme. Second,
we confirm that the analytical solution, which is consistent with Shannon's result,
cannot be stable in sparsely-connected systems. In the case of sparse models, one
might find a polynomial-time algorithm that calculates the suboptimal solutions,
providing a practical method of lossy compression. We are currently working on the
verification.